{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>\n",
    "\n",
    "<i>Licensed under the MIT License.</i>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LightGCN - simplified GCN model for recommendation\n",
    "\n",
    "This notebook serves as an introduction to LightGCN, which is an simple, linear and neat Graph Convolution Network (GCN) model for recommendation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0 Global Settings and Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "System version: 3.6.11 | packaged by conda-forge | (default, Aug  5 2020, 20:09:42) \n",
      "[GCC 7.5.0]\n",
      "Pandas version: 0.25.3\n",
      "Tensorflow version: 1.15.2\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "sys.path.append(\"../../\")\n",
    "import os\n",
    "import papermill as pm\n",
    "import scrapbook as sb\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "tf.get_logger().setLevel('ERROR') # only show error messages\n",
    "\n",
    "from reco_utils.common.timer import Timer\n",
    "from reco_utils.recommender.deeprec.models.graphrec.lightgcn import LightGCN\n",
    "from reco_utils.recommender.deeprec.DataModel.ImplicitCF import ImplicitCF\n",
    "from reco_utils.dataset import movielens\n",
    "from reco_utils.dataset.python_splitters import python_stratified_split\n",
    "from reco_utils.evaluation.python_evaluation import map_at_k, ndcg_at_k, precision_at_k, recall_at_k\n",
    "from reco_utils.common.constants import SEED as DEFAULT_SEED\n",
    "from reco_utils.recommender.deeprec.deeprec_utils import prepare_hparams\n",
    "\n",
    "print(\"System version: {}\".format(sys.version))\n",
    "print(\"Pandas version: {}\".format(pd.__version__))\n",
    "print(\"Tensorflow version: {}\".format(tf.__version__))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "# top k items to recommend\n",
    "TOP_K = 10\n",
    "\n",
    "# Select MovieLens data size: 100k, 1m, 10m, or 20m\n",
    "MOVIELENS_DATA_SIZE = '100k'\n",
    "\n",
    "# Model parameters\n",
    "EPOCHS = 50\n",
    "BATCH_SIZE = 1024\n",
    "\n",
    "SEED = DEFAULT_SEED  # Set None for non-deterministic results\n",
    "\n",
    "yaml_file = \"../../reco_utils/recommender/deeprec/config/lightgcn.yaml\"\n",
    "user_file = \"../../tests/resources/deeprec/lightgcn/user_embeddings.csv\"\n",
    "item_file = \"../../tests/resources/deeprec/lightgcn/item_embeddings.csv\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 LightGCN model architecture\n",
    "\n",
    "LightGCN is a simplified design of GCN to make it more concise and appropriate for recommendation. The model architecture is illustrated below.\n",
    "\n",
    "<img src=\"https://recodatasets.z20.web.core.windows.net/images/lightGCN-model.jpg\">\n",
    "\n",
    "In Light Graph Convolution, only the normalized sum of neighbor embeddings is performed towards next layer; other operations like self-connection, feature transformation, and nonlinear activation are all removed, which largely simplifies GCNs. In Layer Combination, we sum over the embeddings at each layer to obtain the final representations.\n",
    "\n",
    "### 1.1 Light Graph Convolution (LGC)\n",
    "\n",
    "In LightGCN, we adopt the simple weighted sum aggregator and abandon the use of feature transformation and nonlinear activation. The graph convolution operation in LightGCN is defined as:\n",
    "\n",
    "$$\n",
    "\\begin{array}{l}\n",
    "\\mathbf{e}_{u}^{(k+1)}=\\sum_{i \\in \\mathcal{N}_{u}} \\frac{1}{\\sqrt{\\left|\\mathcal{N}_{u}\\right|} \\sqrt{\\left|\\mathcal{N}_{i}\\right|}} \\mathbf{e}_{i}^{(k)} \\\\\n",
    "\\mathbf{e}_{i}^{(k+1)}=\\sum_{u \\in \\mathcal{N}_{i}} \\frac{1}{\\sqrt{\\left|\\mathcal{N}_{i}\\right|} \\sqrt{\\left|\\mathcal{N}_{u}\\right|}} \\mathbf{e}_{u}^{(k)}\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "The symmetric normalization term $\\frac{1}{\\sqrt{\\left|\\mathcal{N}_{u}\\right|} \\sqrt{\\left|\\mathcal{N}_{i}\\right|}}$ follows the design of standard GCN, which can avoid the scale of embeddings increasing with graph convolution operations.\n",
    "\n",
    "\n",
    "### 1.2 Layer Combination and Model Prediction\n",
    "\n",
    "In LightGCN, the only trainable model parameters are the embeddings at the 0-th layer, i.e., $\\mathbf{e}_{u}^{(0)}$ for all users and $\\mathbf{e}_{i}^{(0)}$ for all items. When they are given, the embeddings at higher layers can be computed via LGC. After $K$ layers LGC, we further combine the embeddings obtained at each layer to form the final representation of a user (an item):\n",
    "\n",
    "$$\n",
    "\\mathbf{e}_{u}=\\sum_{k=0}^{K} \\alpha_{k} \\mathbf{e}_{u}^{(k)} ; \\quad \\mathbf{e}_{i}=\\sum_{k=0}^{K} \\alpha_{k} \\mathbf{e}_{i}^{(k)}\n",
    "$$\n",
    "\n",
    "where $\\alpha_{k} \\geq 0$ denotes the importance of the $k$-th layer embedding in constituting the final embedding. In our experiments, we set $\\alpha_{k}$ uniformly as $1 / (K+1)$.\n",
    "\n",
    "The model prediction is defined as the inner product of user and item final representations:\n",
    "\n",
    "$$\n",
    "\\hat{y}_{u i}=\\mathbf{e}_{u}^{T} \\mathbf{e}_{i}\n",
    "$$\n",
    "\n",
    "which is used as the ranking score for recommendation generation.\n",
    "\n",
    "\n",
    "### 1.3 Matrix Form\n",
    "\n",
    "Let the user-item interaction matrix be $\\mathbf{R} \\in \\mathbb{R}^{M \\times N}$ where $M$ and $N$ denote the number of users and items, respectively, and each entry $R_{ui}$ is 1 if $u$ has interacted with item $i$ otherwise 0. We then obtain the adjacency matrix of the user-item graph as\n",
    "\n",
    "$$\n",
    "\\mathbf{A}=\\left(\\begin{array}{cc}\n",
    "\\mathbf{0} & \\mathbf{R} \\\\\n",
    "\\mathbf{R}^{T} & \\mathbf{0}\n",
    "\\end{array}\\right)\n",
    "$$\n",
    "\n",
    "Let the 0-th layer embedding matrix be $\\mathbf{E}^{(0)} \\in \\mathbb{R}^{(M+N) \\times T}$, where $T$ is the embedding size. Then we can obtain the matrix equivalent form of LGC as:\n",
    "\n",
    "$$\n",
    "\\mathbf{E}^{(k+1)}=\\left(\\mathbf{D}^{-\\frac{1}{2}} \\mathbf{A} \\mathbf{D}^{-\\frac{1}{2}}\\right) \\mathbf{E}^{(k)}\n",
    "$$\n",
    "\n",
    "where $\\mathbf{D}$ is a $(M+N) \\times(M+N)$ diagonal matrix, in which each entry $D_{ii}$ denotes the number of nonzero entries in the $i$-th row vector of the adjacency matrix $\\mathbf{A}$ (also named as degree matrix). Lastly, we get the final embedding matrix used for model prediction as:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\mathbf{E} &=\\alpha_{0} \\mathbf{E}^{(0)}+\\alpha_{1} \\mathbf{E}^{(1)}+\\alpha_{2} \\mathbf{E}^{(2)}+\\ldots+\\alpha_{K} \\mathbf{E}^{(K)} \\\\\n",
    "&=\\alpha_{0} \\mathbf{E}^{(0)}+\\alpha_{1} \\tilde{\\mathbf{A}} \\mathbf{E}^{(0)}+\\alpha_{2} \\tilde{\\mathbf{A}}^{2} \\mathbf{E}^{(0)}+\\ldots+\\alpha_{K} \\tilde{\\mathbf{A}}^{K} \\mathbf{E}^{(0)}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "where $\\tilde{\\mathbf{A}}=\\mathbf{D}^{-\\frac{1}{2}} \\mathbf{A} \\mathbf{D}^{-\\frac{1}{2}}$ is the symmetrically normalized matrix.\n",
    "\n",
    "### 1.4 Model Training\n",
    "\n",
    "We employ the Bayesian Personalized Ranking (BPR) loss which is a pairwise loss that encourages the prediction of an observed entry to be higher than its unobserved counterparts:\n",
    "\n",
    "$$\n",
    "L_{B P R}=-\\sum_{u=1}^{M} \\sum_{i \\in \\mathcal{N}_{u}} \\sum_{j \\notin \\mathcal{N}_{u}} \\ln \\sigma\\left(\\hat{y}_{u i}-\\hat{y}_{u j}\\right)+\\lambda\\left\\|\\mathbf{E}^{(0)}\\right\\|^{2}\n",
    "$$\n",
    "\n",
    "Where $\\lambda$ controls the $L_2$ regularization strength. We employ the Adam optimizer and use it in a mini-batch manner.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2 TensorFlow implementation of LightGCN\n",
    "\n",
    "We will use the MovieLens dataset, which is composed of integer ratings from 1 to 5.\n",
    "\n",
    "We convert MovieLens into implicit feedback for model training and evaluation.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3 TensorFlow LightGCN movie recommender\n",
    "\n",
    "### 3.1 Load and split data\n",
    "\n",
    "We split the full dataset into a `train` and `test` dataset to evaluate performance of the algorithm against a held-out set not seen during training. Because SAR generates recommendations based on user preferences, all users that are in the test set must also exist in the training set. For this case, we can use the provided `python_stratified_split` function which holds out a percentage (in this case 25%) of items from each user, but ensures all users are in both `train` and `test` datasets. Other options are available in the `dataset.python_splitters` module which provide more control over how the split occurs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 4.81k/4.81k [00:01<00:00, 4.50kKB/s]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>userID</th>\n",
       "      <th>itemID</th>\n",
       "      <th>rating</th>\n",
       "      <th>timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>196</td>\n",
       "      <td>242</td>\n",
       "      <td>3.0</td>\n",
       "      <td>881250949</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>186</td>\n",
       "      <td>302</td>\n",
       "      <td>3.0</td>\n",
       "      <td>891717742</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>22</td>\n",
       "      <td>377</td>\n",
       "      <td>1.0</td>\n",
       "      <td>878887116</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>244</td>\n",
       "      <td>51</td>\n",
       "      <td>2.0</td>\n",
       "      <td>880606923</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>166</td>\n",
       "      <td>346</td>\n",
       "      <td>1.0</td>\n",
       "      <td>886397596</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   userID  itemID  rating  timestamp\n",
       "0     196     242     3.0  881250949\n",
       "1     186     302     3.0  891717742\n",
       "2      22     377     1.0  878887116\n",
       "3     244      51     2.0  880606923\n",
       "4     166     346     1.0  886397596"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = movielens.load_pandas_df(size=MOVIELENS_DATA_SIZE)\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "train, test = python_stratified_split(df, ratio=0.75)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Process data\n",
    "\n",
    "`ImplicitCF` is a class that intializes and loads data for the training process. During the initialization of this class, user IDs and item IDs are reindexed, ratings greater than zero are converted into implicit positive interaction, and adjacency matrix $R$ of user-item graph is created. Some important methods of `ImplicitCF` are:\n",
    "\n",
    "`get_norm_adj_mat`, load normalized adjacency matrix of user-item graph if it already exists in `adj_dir`, otherwise call `create_norm_adj_mat` to create the matrix and save the matrix if `adj_dir` is not `None`. This method will be called during the initialization process of LightGCN model.\n",
    "\n",
    "`create_norm_adj_mat`, create normalized adjacency matrix of user-item graph by calculating $D^{-\\frac{1}{2}} A D^{-\\frac{1}{2}}$, where $\\mathbf{A}=\\left(\\begin{array}{cc}\\mathbf{0} & \\mathbf{R} \\\\ \\mathbf{R}^{T} & \\mathbf{0}\\end{array}\\right)$.\n",
    "\n",
    "`train_loader`, generate a batch of training data — sample a batch of users and then sample one positive item and one negative item for each user. This method will be called before each epoch of the training process.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = ImplicitCF(train=train, test=test, seed=SEED)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.3 Prepare hyper-parameters\n",
    "\n",
    "Important parameters of `LightGCN` model are:\n",
    "\n",
    "`data`, initialized LightGCNDataset object.\n",
    "\n",
    "`epochs`, number of epochs for training.\n",
    "\n",
    "`n_layers`, number of layers of the model.\n",
    "\n",
    "`eval_epoch`, if it is not None, evaluation metrics will be calculated on test set every \"eval_epoch\" epochs. In this way, we can observe the effect of the model during the training process.\n",
    "\n",
    "`top_k`, the number of items to be recommended for each user when calculating ranking metrics.\n",
    "\n",
    "A complete list of parameters can be found in `yaml_file`. We use `prepare_hparams` to read the yaml file and prepare a full set of parameters for the model. Parameters passed as the function's parameters will overwrite yaml settings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "hparams = prepare_hparams(yaml_file,\n",
    "                          n_layers=3,\n",
    "                          batch_size=BATCH_SIZE,\n",
    "                          epochs=EPOCHS,\n",
    "                          learning_rate=0.005,\n",
    "                          eval_epoch=5,\n",
    "                          top_k=TOP_K,\n",
    "                         )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.4 Create and train model\n",
    "\n",
    "With data and parameters prepared, we can create the LightGCN model.\n",
    "\n",
    "To train the model, we simply need to call the `fit()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Already create adjacency matrix.\n",
      "Already normalize adjacency matrix.\n",
      "Using xavier initialization.\n"
     ]
    }
   ],
   "source": [
    "model = LightGCN(hparams, data, seed=SEED)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1 (train)1.4s: train loss = 0.47162 = (mf)0.47137 + (embed)0.00025\n",
      "Epoch 2 (train)0.9s: train loss = 0.26998 = (mf)0.26926 + (embed)0.00072\n",
      "Epoch 3 (train)0.9s: train loss = 0.24061 = (mf)0.23969 + (embed)0.00093\n",
      "Epoch 4 (train)0.9s: train loss = 0.22621 = (mf)0.22512 + (embed)0.00109\n",
      "Epoch 5 (train)0.8s + (eval)0.4s: train loss = 0.22219 = (mf)0.22098 + (embed)0.00121, recall = 0.16303, ndcg = 0.35414, precision = 0.30817, map = 0.09479\n",
      "Epoch 6 (train)0.7s: train loss = 0.21284 = (mf)0.21152 + (embed)0.00131\n",
      "Epoch 7 (train)0.8s: train loss = 0.20306 = (mf)0.20162 + (embed)0.00144\n",
      "Epoch 8 (train)0.7s: train loss = 0.19611 = (mf)0.19454 + (embed)0.00157\n",
      "Epoch 9 (train)0.8s: train loss = 0.18534 = (mf)0.18364 + (embed)0.00170\n",
      "Epoch 10 (train)0.8s + (eval)0.2s: train loss = 0.18289 = (mf)0.18104 + (embed)0.00185, recall = 0.17857, ndcg = 0.38568, precision = 0.33669, map = 0.10658\n",
      "Epoch 11 (train)0.9s: train loss = 0.17330 = (mf)0.17132 + (embed)0.00198\n",
      "Epoch 12 (train)0.9s: train loss = 0.16902 = (mf)0.16691 + (embed)0.00211\n",
      "Epoch 13 (train)0.8s: train loss = 0.16659 = (mf)0.16437 + (embed)0.00222\n",
      "Epoch 14 (train)0.8s: train loss = 0.16300 = (mf)0.16066 + (embed)0.00233\n",
      "Epoch 15 (train)0.8s + (eval)0.2s: train loss = 0.16449 = (mf)0.16207 + (embed)0.00242, recall = 0.18845, ndcg = 0.39969, precision = 0.35090, map = 0.11312\n",
      "Epoch 16 (train)0.8s: train loss = 0.15979 = (mf)0.15729 + (embed)0.00250\n",
      "Epoch 17 (train)0.8s: train loss = 0.15828 = (mf)0.15568 + (embed)0.00259\n",
      "Epoch 18 (train)0.9s: train loss = 0.15787 = (mf)0.15519 + (embed)0.00268\n",
      "Epoch 19 (train)0.9s: train loss = 0.15500 = (mf)0.15224 + (embed)0.00276\n",
      "Epoch 20 (train)1.0s + (eval)0.3s: train loss = 0.15026 = (mf)0.14740 + (embed)0.00286, recall = 0.19753, ndcg = 0.42133, precision = 0.36723, map = 0.12255\n",
      "Epoch 21 (train)1.0s: train loss = 0.14811 = (mf)0.14515 + (embed)0.00296\n",
      "Epoch 22 (train)0.9s: train loss = 0.14676 = (mf)0.14370 + (embed)0.00306\n",
      "Epoch 23 (train)0.9s: train loss = 0.14456 = (mf)0.14139 + (embed)0.00317\n",
      "Epoch 24 (train)0.8s: train loss = 0.14317 = (mf)0.13991 + (embed)0.00326\n",
      "Epoch 25 (train)0.9s + (eval)0.3s: train loss = 0.14045 = (mf)0.13708 + (embed)0.00337, recall = 0.20329, ndcg = 0.42989, precision = 0.37731, map = 0.12708\n",
      "Epoch 26 (train)0.9s: train loss = 0.13827 = (mf)0.13478 + (embed)0.00349\n",
      "Epoch 27 (train)0.8s: train loss = 0.13663 = (mf)0.13304 + (embed)0.00359\n",
      "Epoch 28 (train)0.9s: train loss = 0.13429 = (mf)0.13059 + (embed)0.00370\n",
      "Epoch 29 (train)1.0s: train loss = 0.13277 = (mf)0.12896 + (embed)0.00381\n",
      "Epoch 30 (train)0.9s + (eval)0.2s: train loss = 0.12975 = (mf)0.12583 + (embed)0.00393, recall = 0.20393, ndcg = 0.43713, precision = 0.38282, map = 0.12933\n",
      "Epoch 31 (train)1.0s: train loss = 0.12694 = (mf)0.12289 + (embed)0.00405\n",
      "Epoch 32 (train)1.0s: train loss = 0.12595 = (mf)0.12178 + (embed)0.00417\n",
      "Epoch 33 (train)1.0s: train loss = 0.12526 = (mf)0.12098 + (embed)0.00428\n",
      "Epoch 34 (train)0.9s: train loss = 0.12110 = (mf)0.11669 + (embed)0.00441\n",
      "Epoch 35 (train)0.9s + (eval)0.2s: train loss = 0.12038 = (mf)0.11584 + (embed)0.00454, recall = 0.20941, ndcg = 0.44525, precision = 0.38865, map = 0.13433\n",
      "Epoch 36 (train)0.9s: train loss = 0.11915 = (mf)0.11450 + (embed)0.00465\n",
      "Epoch 37 (train)0.8s: train loss = 0.12037 = (mf)0.11561 + (embed)0.00476\n",
      "Epoch 38 (train)0.9s: train loss = 0.11479 = (mf)0.10990 + (embed)0.00489\n",
      "Epoch 39 (train)0.9s: train loss = 0.11375 = (mf)0.10873 + (embed)0.00502\n",
      "Epoch 40 (train)0.9s + (eval)0.3s: train loss = 0.11331 = (mf)0.10817 + (embed)0.00514, recall = 0.21353, ndcg = 0.45133, precision = 0.39597, map = 0.13544\n",
      "Epoch 41 (train)0.9s: train loss = 0.11028 = (mf)0.10503 + (embed)0.00525\n",
      "Epoch 42 (train)0.9s: train loss = 0.11127 = (mf)0.10588 + (embed)0.00539\n",
      "Epoch 43 (train)0.9s: train loss = 0.10781 = (mf)0.10232 + (embed)0.00549\n",
      "Epoch 44 (train)0.9s: train loss = 0.10583 = (mf)0.10021 + (embed)0.00562\n",
      "Epoch 45 (train)0.9s + (eval)0.2s: train loss = 0.10706 = (mf)0.10133 + (embed)0.00573, recall = 0.21276, ndcg = 0.45422, precision = 0.39799, map = 0.13683\n",
      "Epoch 46 (train)0.9s: train loss = 0.10515 = (mf)0.09929 + (embed)0.00586\n",
      "Epoch 47 (train)0.9s: train loss = 0.10151 = (mf)0.09552 + (embed)0.00599\n",
      "Epoch 48 (train)0.9s: train loss = 0.10157 = (mf)0.09544 + (embed)0.00613\n",
      "Epoch 49 (train)0.9s: train loss = 0.10073 = (mf)0.09449 + (embed)0.00624\n",
      "Epoch 50 (train)1.0s + (eval)0.2s: train loss = 0.10069 = (mf)0.09432 + (embed)0.00636, recall = 0.21320, ndcg = 0.45300, precision = 0.39926, map = 0.13531\n",
      "Took 47.25114500336349 seconds for training.\n"
     ]
    }
   ],
   "source": [
    "with Timer() as train_time:\n",
    "    model.fit()\n",
    "\n",
    "print(\"Took {} seconds for training.\".format(train_time.interval))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.4 Recommendation and Evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recommendation and evaluation have been performed on the specified test set during training. After training, we can also use the model to perform recommendation and evalution on other data. Here we still use `test` as test data, but `test` can be replaced by other data with similar data structure."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3.4.1 Recommendation\n",
    "\n",
    "We can call `recommend_k_items` to recommend k items for each user passed in this function. We set `remove_seen=True` to remove the items already seen by the user. The function returns a dataframe, containing each user and top k items recommended to them and the corresponding ranking scores."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>userID</th>\n",
       "      <th>itemID</th>\n",
       "      <th>prediction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "      <td>5.597007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>475</td>\n",
       "      <td>5.193120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>89</td>\n",
       "      <td>5.147971</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>210</td>\n",
       "      <td>5.133258</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>127</td>\n",
       "      <td>5.024765</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   userID  itemID  prediction\n",
       "0       1       7    5.597007\n",
       "1       1     475    5.193120\n",
       "2       1      89    5.147971\n",
       "3       1     210    5.133258\n",
       "4       1     127    5.024765"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "topk_scores = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True)\n",
    "\n",
    "topk_scores.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3.4.2 Evaluation\n",
    "\n",
    "With `topk_scores` predicted by the model, we can evaluate how LightGCN performs on this test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MAP:\t0.135305\n",
      "NDCG:\t0.453004\n",
      "Precision@K:\t0.399258\n",
      "Recall@K:\t0.213203\n"
     ]
    }
   ],
   "source": [
    "eval_map = map_at_k(test, topk_scores, k=TOP_K)\n",
    "eval_ndcg = ndcg_at_k(test, topk_scores, k=TOP_K)\n",
    "eval_precision = precision_at_k(test, topk_scores, k=TOP_K)\n",
    "eval_recall = recall_at_k(test, topk_scores, k=TOP_K)\n",
    "\n",
    "print(\"MAP:\\t%f\" % eval_map,\n",
    "      \"NDCG:\\t%f\" % eval_ndcg,\n",
    "      \"Precision@K:\\t%f\" % eval_precision,\n",
    "      \"Recall@K:\\t%f\" % eval_recall, sep='\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.13530545945796316,
       "encoder": "json",
       "name": "map",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "map"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.45300447729353804,
       "encoder": "json",
       "name": "ndcg",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "ndcg"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.3992576882290562,
       "encoder": "json",
       "name": "precision",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "precision"
      }
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/scrapbook.scrap.json+json": {
       "data": 0.2132025994579782,
       "encoder": "json",
       "name": "recall",
       "version": 1
      }
     },
     "metadata": {
      "scrapbook": {
       "data": true,
       "display": false,
       "name": "recall"
      }
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Record results with papermill for tests\n",
    "sb.glue(\"map\", eval_map)\n",
    "sb.glue(\"ndcg\", eval_ndcg)\n",
    "sb.glue(\"precision\", eval_precision)\n",
    "sb.glue(\"recall\", eval_recall)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.5 Infer embeddings\n",
    "\n",
    "With `infer_embedding` method of LightGCN model, we can export the embeddings of users and items in the training set to CSV files for future use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.infer_embedding(user_file, item_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.6 Compare with SAR and NCF\n",
    "\n",
    "Here there are the performances of LightGCN compared to [SAR](../00_quick_start/sar_movielens.ipynb) and [NCF](../00_quick_start/ncf_movielens.ipynb) on MovieLens dataset of 100k and 1m. The method of data loading and splitting is the same as that described above and the GPU used was a GeForce GTX 1080Ti.\n",
    "\n",
    "Settings common to the three models: `epochs=15, seed=42`.\n",
    "\n",
    "Settings for LightGCN: `embed_size=64, n_layers=3, batch_size=1024, decay=0.0001, learning_rate=0.015 `.\n",
    "\n",
    "Settings for SAR: `similarity_type=\"jaccard\", time_decay_coefficient=30, time_now=None, timedecay_formula=True`.\n",
    "\n",
    "Settings for NCF: `n_factors=4, layer_sizes=[16, 8, 4], batch_size=1024, learning_rate=0.001`.\n",
    "\n",
    "| Data Size | Model    | Training time | Recommending time | MAP@10   | nDCG@10  | Precision@10 | Recall@10 |\n",
    "| --------- | -------- | ------------- | ----------------- | -------- | -------- | ------------ | --------- |\n",
    "| 100k      | LightGCN | 27.8865       | 0.6445            | 0.129236 | 0.436297 | 0.381866     | 0.205816  |\n",
    "| 100k      | SAR      | 0.4895        | 0.1144            | 0.110591 | 0.382461 | 0.330753     | 0.176385  |\n",
    "| 100k      | NCF      | 116.3174      | 7.7660            | 0.105725 | 0.387603 | 0.342100     | 0.174580  |\n",
    "| 1m        | LightGCN | 396.7298      | 1.4343            | 0.075012 | 0.377501 | 0.345679     | 0.128096  |\n",
    "| 1m        | SAR      | 4.5593        | 2.8357            | 0.060579 | 0.299245 | 0.270116     | 0.104350  |\n",
    "| 1m        | NCF      | 1601.5846     | 85.4567           | 0.062821 | 0.348770 | 0.320613     | 0.108121  |\n",
    "\n",
    "From the above results, we can see that LightGCN performs better than the other two models."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reference: \n",
    "1. Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang & Meng Wang, LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation, 2020, https://arxiv.org/abs/2002.02126\n",
    "\n",
    "2. LightGCN implementation [TensorFlow]: https://github.com/kuandeng/lightgcn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Tags",
  "kernelspec": {
   "display_name": "Python (reco_gpu)",
   "language": "python",
   "name": "reco_gpu"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
